Distributed signaling in a virtual machine cluster

ABSTRACT

Technology for sharing data among multiple virtual machines in a cluster of virtual machines is disclosed. Each virtual machine identifies “managed” objects of an instance of an application running at the virtual machine. Operations performed by an instance of one application which affect the state of managed objects are detected and distributed. Technology is included for distributing signaling between threads on different virtual machines. The technique extends existing language semantics, including “synchronized”, “wait” and “notify”, “thread.join” and network call methods to an entire cluster of virtual machines.

CROSS REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. provisional patent application No. 60/684,610, filed May 25, 2005, titled “Terracotta Virtualization Server”, and incorporated herein by reference.

The application is also related to the following co-pending applications, each of which is incorporated herein by reference:

(1) U.S. patent application Ser. No. ______, filed ______, titled “Clustering Server Providing Virtual Machine Data Sharing”, docket no. TERA-01000US1;

(2) U.S. patent application Ser. No. ______, filed ______, titled “Clustered Object State Using Synthetic Transactions”, docket no. TERA-01002US0;

(3) U.S. patent application Ser. No. ______, filed ______, titled “Distributed Object Identity in a Virtual Machine Cluster”, docket no. TERA-01003US0;

(4) U.S. patent application Ser. No. ______, filed ______, titled “Clustered Object State Using Field Set Operations”, docket no. TERA-01004US0;

(5) U.S. patent application Ser. No. ______, filed ______, titled “Clustered Object State Using Logical Actions”, docket no. TERA-01005US0;

(6) U.S. patent application Ser. No. ______, filed ______, titled “Lock Management for Clustered Virtual Machines”, docket no. TERA-01012US0.

BACKGROUND

Application developers have traditionally faced a number of challenges in horizontally scaling applications to multiple servers. Scaling is particularly useful to World Wide Web application developers who may, for example, require geographically distributed application servers to provide users with better performance. In one example, suppose a user of a web-based application logs on to a web site to change information in an existing user account. Typically, in a distributed application, one application server is selected to handle the transaction based on its geographical location, availability or other factors. The selected server accesses the account data and makes the requested changes locally, and the updated data must then be shared with the other servers so that the user's future interactions with any of the servers will reflect the updates. Additionally, the fact that some servers may go offline while others come online must be considered.

This scaling challenge is faced by developers in many development environments, including developers using the popular Java development platform. The Java platform's goal in providing a platform independent environment is generally met by the fact that Java source code is compiled into an intermediate language called “bytecode,” which can reside on any hardware platform. In order to run the bytecode, it must be compiled into machine code via a Java Virtual Machine (JVM). A JVM is a platform-independent execution environment that converts Java bytecode into machine language and executes it. The JVM provides the developer with the tools necessary for multi-threaded applications, including thread support, synchronization and garbage collection.

FIG. 1A illustrates a traditional implementation of a Java application running on a virtual machine under a given operating system on a processing system or server. As developers have attempted to scale Java applications to multiple processing systems, difficulties in maintaining object and primitive states across the systems become more numerous.

Traditionally, application developers themselves have been required to account for scaling using common forms of inter-server communication in order to share objects amongst distributed JVMs. One form of communication is Remote Method Invocation (RMI), which is a set of protocols that enables Java objects to communicate remotely with other Java objects. Another form of communication is the Java Message Service (JMS), which is an Application Program Interface (API) for accessing enterprise messaging systems. JMS supports both message queuing and publish-subscribe styles of messaging. Java Temporary Caching (JCache) is a distributed caching system for server-side Java applications.

While each of these techniques allows the developer the flexibility to add scaling to their application, the conventional techniques require the application code to be modified, resulting in significant added complexity and development costs. Further, the conventional techniques limit scalability of the application tier, are often quite slow, and tend to abuse database infrastructure for transient needs. Finally, the task of maintaining object identity is a challenge as multiple instances of objects can be created at the different application servers.

An improved technology is needed for maintaining consistent data across virtual machines.

SUMMARY

The technology herein, roughly described, provides a technique for sharing data among multiple virtual machines in a cluster of virtual machines.

Data sharing functionality is provided to application software which was designed for use on a single virtual machine. In one approach, a data sharing agent includes a lock manager, a transaction manager, an object manager, and a communication manager. A central manager, which may be provided on another server, interacts with virtual machines in the cluster to facilitate sharing so that object state is maintained consistently on all virtual machines. The central manager includes an object manager, a lock manager, transaction manager, communication manager, and a persistence manager.

In one aspect, the invention includes distributed signaling between threads on different virtual machines. Here, a technique is provided for signaling threads on different VMs to coordinate the starting and stopping of threads. The technique extends existing language semantics, including “synchronized”, “wait” and “notify”, and network call methods to an entire cluster of virtual machines.

In one aspect, the invention is a method for distributing thread signaling amongst virtual machines in a cluster. The method includes the steps of: receiving a signal from a first virtual machine indicating a method call on a method in a first thread which is dependent on an action of a second thread; monitoring at least the second thread for completion of the action; and providing a signal to the first thread on the first virtual machine indicating the action of the second thread; wherein the second thread is operating on a second virtual machine.
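The three steps above can be illustrated, in a much reduced form, inside a single JVM. The following is a minimal sketch only: the class names SignalHub and SignalDemo and the action identifier "job-1" are hypothetical and do not appear in this disclosure, and ordinary threads stand in for threads on separate virtual machines. It shows a first thread that depends on an action of a second thread, and a relay object that wakes the first thread once the action is reported.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the component that relays signals between threads. */
class SignalHub {
    private final Set<String> completedActions = new HashSet<>();

    /** Blocks the calling (first) thread until the named action has been reported. */
    public synchronized void awaitAction(String actionId) throws InterruptedException {
        while (!completedActions.contains(actionId)) {
            wait(); // guarded wait: the condition is re-checked after every wake-up
        }
    }

    /** Called once the second thread has completed the action; wakes all waiters. */
    public synchronized void reportAction(String actionId) {
        completedActions.add(actionId);
        notifyAll();
    }
}

class SignalDemo {
    /** Runs two threads and returns the order in which they logged progress. */
    static String run() {
        SignalHub hub = new SignalHub();
        StringBuffer log = new StringBuffer(); // StringBuffer methods are synchronized
        Thread first = new Thread(() -> {
            try {
                hub.awaitAction("job-1");      // first thread depends on the second
                log.append("first-resumed;");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread second = new Thread(() -> {
            log.append("second-done;");        // the action the first thread waits for
            hub.reportAction("job-1");
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "second-done;first-resumed;"
    }
}
```

In a clustered setting the relay would sit outside both virtual machines; the sketch keeps it in-process purely to show the ordering guarantee.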

In another aspect, the invention is a computer-implemented method for providing signaling between threads on different virtual machines. The method includes receiving a notification that a first thread on a first virtual machine is signaling an action on a method in the thread; and informing one or more additional virtual machines, on which one or more respective instances of the application are running and which have one or more respective threads, of the signal by the first thread.

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A illustrates a conventional Java application environment.

FIG. 1B illustrates a logical depiction of a clustering server technology discussed herein.

FIG. 1C illustrates a system in which a central manager facilitates data sharing among a group or cluster of virtual machines.

FIG. 2 illustrates various layers of control within a virtual machine.

FIG. 3A illustrates a method for identifying and sharing managed objects among virtual machines.

FIG. 3B illustrates a method for defining transactions within the context of the technology.

FIG. 3C illustrates transaction boundaries within a code segment.

FIG. 4 illustrates a representation of an object graph of managed objects.

FIG. 5 illustrates an example of managed objects, including classes and fields.

FIG. 6 illustrates a method for distributing object operations and data among virtual machines.

FIG. 7 illustrates the sharing of object data from a first virtual machine, in an initial update, using operation logs of the first virtual machine, and an operation log of a central manager.

FIG. 8 illustrates the sharing of object data from a first virtual machine, in an incremental update, using an operation log of the first virtual machine.

FIG. 9 illustrates a method for sharing of logical operations including field level object data and logical collections among virtual machines.

FIG. 10 illustrates a method for sharing object identity among virtual machines.

FIG. 11A illustrates a method for providing clustered locking.

FIGS. 11B-11D illustrate the signaling occurring in FIG. 11A.

FIG. 12A illustrates a method for providing greedy locking.

FIGS. 12B-12D illustrate the signaling occurring in FIG. 12A.

FIG. 12E is a state machine description of the method of FIG. 12A.

FIG. 13 illustrates a method for distributing thread signaling amongst virtual machines in a cluster.

DETAILED DESCRIPTION

The technology described herein includes a set of integrated components that provides a common virtual machine capability for application programs running on distributed systems each having its own local virtual machine. The components discussed herein allow transient data (data actually stored in memory in a virtual machine as part of in-memory object state) to be shared across various virtual machines. In a unique aspect, object state is shared through a series of shared operations, either logical or physical operations, which are detected and distributed as a series of transactions in order to replicate the current state of objects at any of the virtual machines throughout a cluster.

The technology will be described with respect to its application in conjunction with Java applications and Java Virtual Machines. However, it should be recognized that the inventive concepts have broader applicability to other virtual machine and development environments. Moreover, the managed objects utilized on various virtual machines need not be shared by the same application. Finally, as explained herein, respective virtual machines need not operate concurrently.

FIG. 1B is a block diagram depicting a logical representation of the technology discussed herein. FIG. 1B illustrates three processing systems each including an application 40, 50, 60 operating on a local virtual machine 42, 52, 62. The technology discussed herein provides a clustering server 75 which extends the capabilities of each virtual machine to all other processing devices in a given cluster. This includes sharing data amongst each of the virtual machines in a cluster on objects identified by a cluster administrator via a management interface. In this manner, the data sharing functionality can be easily added to application software which was designed for use on a single virtual machine.

This allows various features to be provided by the technology, including sharing of object state between virtual machines, flexible locking which is configurable at run-time, distributed method invocation and distributed wait-and-notify. Benefits include distribution of data among the virtual machines without the need to maintain state in a database, transparent recovery from application instance failures, clustering as an API-free infrastructure service, reduced development, testing, and maintenance costs, faster application development and scale-out, and fine-grained operations performance visibility and control. With the data sharing functionality provided, there is no API for the application developer to learn, apply, test, and maintain since the data sharing agents/libraries provide this transparency. Lower system life-cycle costs are another benefit, since organizations using the system need not spend time writing code in order to provide clustering capabilities. The system accomplishes this as an infrastructure software component that provides clustering as a service based on user-defined configuration settings which can be changed at production-time. This allows an application to adapt its reliability, availability and scalability characteristics to changing user demands without the need for new development.

In many cases, the data sharing functionality enhances performance. For instance, in one approach, when a shared object is updated, only the field-level changes are sent to the different virtual machines, and only those virtual machines with that object currently in memory are supplied with the update in real time. In another approach, the logical operations necessary to create a current object state are shared between virtual machines. These techniques significantly reduce the amount of data movement, and improve all-around performance of a fully clustered application. Moreover, the data sharing functionality provides this inter-virtual machine communication capability in a scalable manner that cannot be matched by peer-to-peer products.

FIG. 1C illustrates an exemplary implementation of the technology in a clustered system. In this implementation, a central manager 140 facilitates data sharing among a group or cluster of application servers 100. A group or cluster, shown generally at 100, includes a number of servers. This represents one embodiment of how applications are scaled to allow multiple servers to run respective instances of an application (111, 121, 131) for load balancing or to provide increased reliability, availability and scalability. In the present example, three servers are provided, namely server “A” 110, server “B” 120 and server “C” 130. The servers can be co-located or geographically distributed, and interconnected by any type of network, such as a LAN or WAN, or communication link (not illustrated).

As used herein each server or processing system includes, for example, one or more processors capable of executing instructions provided on readable media to perform the methods and tasks described herein. The processing system may include a volatile memory, a mass storage device or other non-volatile memory, a portable storage device, one or more communications or network interfaces and various I/O devices. The above described processing system hardware architecture is just one suitable example of a processing system suitable for implementing the present technology.

The servers 110, 120 and 130 each include a separate instance of an application, for example, application instance “A” 111, application instance “B” 121 and application instance “C” 131. Further, each server includes a virtual machine on which the application code executes, namely virtual machine “A” 112, virtual machine “B” 122 and virtual machine “C” 132. For example, each virtual machine may be a Java Virtual Machine (JVM) which executes byte code of an application. In one embodiment, the applications are the same application with different instances; in another embodiment, the applications call the same instances of the same classes of objects in their respective application code.

Each instance of the application runs locally on each application server and interacts with each virtual machine locally. Objects used by the application are created and maintained by the respective virtual machines on each server. In accordance with the invention, the application code for each of the applications need not provide for the clustering operations described herein. In essence, the application is prepared to run on a single virtual machine and extended to the cluster by the technology discussed herein. In this regard, a series of managed objects, which include a local instance of an application object on each server, are identified and clustered by the technology.

A data sharing agent/library 113, 123 and 133 is provided on each respective server to provide functionality for sharing managed objects among the servers, as described in greater detail below. Files stored at the data sharing agent/library are loaded into the associated virtual machine upon start up to modify the application code when compiled into bytecode to provide the data sharing functionality. In particular, the data sharing agents 113, 123 and 133 are responsible for performing bytecode manipulation to implement clustered object management in each local virtual machine 112, 122 and 132. Each may include a lock manager that deals with gaining access to objects under locks, a transaction manager that creates a transaction log as described below, and an object manager. A communication manager may also be provided to enable the virtual machines to communicate with the central manager. The communication manager may include IP address and port information of the central manager.

Each server 110, 120, 130 may also include a conventional operating system and memory for storage of data, such as data used by threads of the application instances.

A central manager 140 is provided to facilitate data sharing among the virtual machines and, in particular, between the virtual machines on which the application instances run. The central manager 140, in conjunction with the data sharing agent/library 113, 123 and 133, acts as a “clustering server” for the applications 111, 121, 131. In essence, each application 111, 121, 131 sees one virtual machine, but with each application instance seeing changes to objects made by other application instances in the cluster. The central manager 140 includes a data sharing application 141 running in an operating system on the manager. The manager may be a separate physical server or may operate on a server with one of the virtual machines. The central manager 140 has the ability to communicate with each of the servers 110, 120 and 130 to share object state information.

The data sharing application 141 works in concert with the data sharing agent/libraries 113, 123 and 133 to distribute shared objects amongst the cluster systems 110, 120, 130. The data sharing application 141 can include a lock manager, transaction manager, communication manager, and a persistence manager. The persistence manager is able to persist object state information to a CM object representation store. The lock manager manages access to distributed locks between the various virtual machines. The transaction manager deals with moving data between members of the cluster in coherent terms. The object manager deals with keeping track of which virtual machines have what objects and what version of the object. The communication manager enables the central manager to communicate with the virtual machines.

The object representation store 144 includes a record of the managed object states in accordance with the methods discussed herein. Because of the persistence of managed objects by the representation, each of the servers 110, 120, 130 need not operate concurrently.

Advantageously, the central manager 140 and data sharing agent/libraries are implemented by a set of infrastructure software (which may be commonly distributed) that can be installed on suitable processing system hardware.

Subsequent to the installation of agents 113, 123 and 133, virtual machines 112, 122 and 132 are essentially clients of the central manager. As such, virtual machines may be referred to herein as clients. It should be understood that FIG. 1C illustrates only one possible implementation of the technology. For example, in FIG. 1C, the central manager 140 can be provided on a server that is separate from the servers hosting the applications or database software, or may be provided on one or more of the virtual machines. Although only one central manager is used in the present example, multiple managers on multiple servers can be clustered together to make a highly-available hub shared by many virtual machines, even across dispersed geographies. It is also possible to run multiple instances of an application at multiple virtual machines on one server.

A management console 150 provides a graphical user interface which allows an operator to configure and monitor the central manager 140. Optionally, the operator may define configuration files which are provided to the data sharing agent/library to specify which objects should be shared. This configuration data allows various objects to be included as managed objects or excluded as managed objects on each of the virtual machines in a cluster. In essence, this provides a form of drop-in/drop-out functionality for the managed objects. The management console can also be used to monitor: a) a current count of unique managed object instances for each client, on a per-class basis; b) a creation rate of managed objects, where both global and per-client counts are provided; c) a rate at which objects are flushed from the clients to the central manager, on a per client basis; d) a rate at which objects are requested from the central manager by a client, on a per client basis; e) a rate at which objects are written to a persistent object store by the central manager; f) a rate at which objects are requested from the persistent object store by the central manager; g) a view of all currently managed roots and fields; h) a list of locks with one or more pending lock or upgrade requests; i) a list of application process threads currently waiting due to contended locks; j) an on-demand display of locks which are currently part of process deadlocks; k) elapsed processing time and number of objects collected by the central manager garbage collection process; and l) a rate at which transactions are committed to the central manager, where both global and per-client counts are provided.

FIG. 2 illustrates an application running within a virtual machine 210, and various mechanisms by which the data sharing agent/libraries interact with an application on a given virtual machine. A virtual machine 210 generally includes a number of class loaders 224. A bootstrap class loader 205 is provided by some implementations of virtual machines. In a Java Virtual Machine, each and every class is loaded by some instance of a class loader. Whenever a new JVM is started, the bootstrap class loader is responsible for loading key Java classes into memory first. The runtime classes are packaged inside of a runtime jar file. Normally, developers do not have access to details of the bootstrap class loader, since this is a native implementation. For the same reason, the behavior of the bootstrap class loader will also differ across JVMs. Other class loaders 224 may also be provided. These include, for example, the Java extension class loader, and the application class loader, responsible for loading all of the classes kept in the path corresponding to the java.class.path system property.

In one approach, application code at the server (110, 120, 130) is instrumented using files stored by the data sharing agent/libraries when the application code is loaded into the virtual machine. Where a bootstrap loader 205 is utilized, a custom “boot.jar” file may be used to replace the class definitions in the system dependent runtime .jar file. Where the virtual machine technology 210 does not implement a bootstrap class loader 205, this technique is not required. Other class loaders 224 are instrumented to allow the data sharing agent/library files to introduce the data sharing functionality into the application classes. Class loaders enable the virtual machine 210 on any respective server to load classes without knowing anything about the underlying file system semantics. Similarly, the class loader 224 allows the application 222 to dynamically load classes. The data sharing agent/libraries can inspect and, if activated, intercept API calls made by the application 222. The scope of interception can be at a byte code level, in which case field updates, method calls, synchronization calls, and wait/notify calls, for instance, are visible and controllable at runtime. When alternative facilities, for example HotSwap or JVMTI, are provided by the virtual machine, the data sharing agent/libraries can introduce the data sharing functionality to application classes through these mechanisms. This technique allows the data sharing agent/libraries to delay and optimize the introduced data sharing functionality.
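As a rough illustration of a load-time interception point, the sketch below uses the standard java.lang.instrument.ClassFileTransformer interface. This is an assumption made for illustration: the disclosure describes instrumented class loaders, HotSwap and JVMTI, and does not state that this particular interface is used. The class name ManagedClassTransformer and its rewrite method are hypothetical, and the bytecode rewriting itself is left as a placeholder that returns an unchanged copy.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.Set;

/** Hypothetical load-time hook that selects which classes would be instrumented. */
class ManagedClassTransformer implements ClassFileTransformer {
    // Internal (slash-separated) names of classes configured as managed.
    private final Set<String> managedClasses;

    ManagedClassTransformer(Set<String> managedClasses) {
        this.managedClasses = managedClasses;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className == null || !managedClasses.contains(className)) {
            return null; // null tells the JVM to keep the original class bytes
        }
        return rewrite(classfileBuffer);
    }

    /**
     * Placeholder for the actual bytecode manipulation (field-set, method-call,
     * synchronization and wait/notify interception). Here it returns an
     * unchanged copy so that the sketch stays self-contained.
     */
    private byte[] rewrite(byte[] original) {
        return original.clone();
    }
}
```

Because stored class files are never touched, starting the application without registering such a hook leaves it running exactly as originally written.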

Note that the application source code remains unchanged, and in one implementation, no stored byte code is changed such that were one to decide not to run the clustering server, one can restart the application without enabling the byte code manipulation. As discussed more fully below, during this process, object classes specified as shared are identified and instrumentation added to allow server locking and logical change tracking.

Due to the instrumentation of bytecode at the virtual machine level, another aspect of “drop-in/drop-out” capability is provided. That is, the data sharing functionality which is provided by the instrumentation can be easily activated or deactivated by a system operator at application runtime. This drop-in/drop-out capability further allows the data sharing functionality to be provided for existing applications without modifying the application code to conform to an API, or providing any such API. The developer can write an application for a single virtual machine and configure the application to be transparently shared across multiple virtual machines. All that is required is the installation of the data sharing agent/libraries and the proper configuration of opt-in parameters via the management console. The drop-in/drop-out capability also allows rapid assessment of the degree to which the data sharing functionality can benefit a given application, what-if analysis of various opt-in sets, and the ability to switch the data sharing functionality on and off at runtime. The drop-in/drop-out capability also eliminates the need to use custom-developed or vendor framework clustering and memory sharing in new applications since these needs can be handled with no need for explicit code.

This data sharing functionality may alternatively be implemented in the bytecode interpreter natively. That is, while developers normally do not have access to the bytecode interpreter, virtual machine providers who do have access to the bytecode interpreter may build the same functionality provided by instrumentation of the bytecode at the classloader level directly into the virtual machine instead.

FIG. 3A illustrates a general method for identifying and sharing managed objects among virtual machines. At block 300, an application begins its execution and at step 302 the application byte code is instrumented prior to execution of any functions on objects, as described above. At block 305, the instrumentation identifies objects of the application for which state information is to be shared. In particular, these objects are identified as root objects of an object graph (see also FIG. 4). These objects are identified based on an operator defined configuration identifying which objects should be managed objects in the cluster.

In this step, the byte code instrumentation adds functionality to each managed class transparently. Exemplary pseudocode representations of this functionality include a function lockmanager.getlock( ), a transactionmanager.starttransaction( ) and a transactionmanager.commitTransaction( ) and lockmanager.releaseLock( ). As will be explained below, the getlock and releaseLock functions request, respectively, a lock from the central manager for the cluster-wide managed object via the lock manager process, and a lock release on the managed object from the central manager. The transactionmanager.starttransaction and transactionmanager.commitTransaction functions are used to generate transactions which communicate changes to the central manager. These functions surround the application code, as described below.
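The way these four calls surround application code can be sketched as follows. This is an illustrative sketch only: the interfaces LockManager and TransactionManager, their camel-case method names, and the Account, InstrumentedDeposit and RecordingManagers classes are hypothetical stand-ins for the pseudocode functions named above, and the recording implementation merely logs call order rather than contacting a central manager.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical interface mirroring the pseudocode lock functions. */
interface LockManager {
    void getLock(Object target);
    void releaseLock(Object target);
}

/** Hypothetical interface mirroring the pseudocode transaction functions. */
interface TransactionManager {
    void startTransaction();
    void commitTransaction();
}

/** Plain application object that would be configured as managed. */
class Account {
    long balance;
}

/** What an instrumented method might look like after the calls are added. */
class InstrumentedDeposit {
    private final LockManager lockManager;
    private final TransactionManager transactionManager;

    InstrumentedDeposit(LockManager lockManager, TransactionManager transactionManager) {
        this.lockManager = lockManager;
        this.transactionManager = transactionManager;
    }

    void deposit(Account account, long amount) {
        lockManager.getLock(account);            // cluster-wide lock requested first
        try {
            transactionManager.startTransaction();
            account.balance += amount;           // the original application code
            transactionManager.commitTransaction();
        } finally {
            lockManager.releaseLock(account);    // released even if the body throws
        }
    }
}

/** Test double that records the order of the surrounding calls. */
class RecordingManagers implements LockManager, TransactionManager {
    final List<String> events = new ArrayList<>();

    public void getLock(Object target) { events.add("getLock"); }
    public void releaseLock(Object target) { events.add("releaseLock"); }
    public void startTransaction() { events.add("startTransaction"); }
    public void commitTransaction() { events.add("commitTransaction"); }
}
```

Placing the lock release in a finally block is a design choice of this sketch, not something the text prescribes; it keeps a failed operation from leaving the lock held.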

At block 310, an object graph of each of the identified root objects is navigated to identify the objects that can be reached from the root object. For example, an object is reachable by the root object if there is a field assignment of an object reference into one of the root's field values at runtime. At block 315, the objects in the object graph are identified as managed objects, such as by flagging the objects. Thus, the root object and the objects reachable from the root object become managed objects. Optionally, the operator can use the management console to selectively exclude objects which are reachable from a root object from being managed by declaring a field to be transient.

In one aspect, the manager allows specification of root objects to manage all objects accessible by the root. An object is reachable by the root object if it is part of the object's reference graph, such as, for example, where there is a field assignment of an object reference into one of the root's field values at runtime.
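Navigating the graph from a root can be sketched with reflection. The following is a simplified illustration under stated assumptions: the class names GraphWalker and Node are hypothetical, Java's own transient modifier is used here as a stand-in for the operator's exclusion setting, and objects of classes loaded by the bootstrap class loader are treated as leaves so that the sketch does not reach into JDK internals.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

class GraphWalker {
    /** Returns the root plus every object reachable through non-transient reference fields. */
    static Set<Object> reachableFrom(Object root) {
        // Identity-based set: two equal-but-distinct objects are tracked separately.
        Set<Object> managed = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Object current = pending.pop();
            if (!managed.add(current)) {
                continue; // already flagged; also guards against cycles
            }
            if (current.getClass().getClassLoader() == null) {
                continue; // bootstrap-loaded (JDK) class: treated as a leaf in this sketch
            }
            for (Class<?> c = current.getClass(); c != null; c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    int mods = field.getModifiers();
                    if (Modifier.isStatic(mods) || Modifier.isTransient(mods)
                            || field.getType().isPrimitive()) {
                        continue;
                    }
                    try {
                        field.setAccessible(true);
                        Object value = field.get(current);
                        if (value != null) {
                            pending.push(value);
                        }
                    } catch (IllegalAccessException | RuntimeException e) {
                        // Inaccessible fields are simply not followed in this sketch.
                    }
                }
            }
        }
        return managed;
    }
}

/** Hypothetical application class used to exercise the walker. */
class Node {
    Node next;              // followed: reachable objects become managed
    transient Node scratch; // not followed: excluded from the managed graph
    int value;
}
```

For example, with a.next = b and a.scratch = c, walking from a yields a set containing a and b but not c.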

FIG. 4 illustrates a representation of an object graph of managed objects. The object graph 400 includes a root object and a number of objects, shown as circles, which are reachable from the root object, as indicated by the connecting arrows. An object pointed to by another object is reachable from that object. A specific illustration is provided below in connection with FIG. 5.

An object graph includes a root object and objects that are reachable from the root object. A root object can be a long-lived object, such as a cache implemented using native Java collections, a servlet session, or a hash map, an example of which is provided by the Java class HashMap. For example, the operator can configure managed objects using a configuration file in a known format, such as XML, or alternatively use the management console to identify the managed objects. Moreover, note that not all objects in an application need be managed. Only a subset of all objects used by an application need to be identified as managed. A managed object is a distributed object whose state is maintained consistently at the different virtual machines in a cluster of virtual machines. Generally, it is desirable to manage objects that represent pure state information, while avoiding replicating objects that refer to operating system resources. For example, business objects such as customer records might make good managed objects.

For example, an XML configuration file at the data sharing agents/libraries may modify values of a “<root>” element. The operator specifies the fully qualified field, and a name to associate with the field. To illustrate, the following configuration sets up two objects for sharing, “exampleField1” and “exampleField2”, which are members of the “ExampleClass1” and “MyClass2” classes, respectively:

<roots>
  <root>
    <field-name>ExampleClass1.exampleField1</field-name>
    <root-name>exampleRoot1</root-name>
  </root>
  <root>
    <field-name>MyClass2.exampleField2</field-name>
    <root-name>exampleRoot2</root-name>
  </root>
</roots>
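A configuration of this shape could be read with the standard Java XML parser. The sketch below is illustrative only: the class name RootConfigReader and its readRoots method are hypothetical, and the disclosure does not specify how the agent actually parses its configuration. It maps each fully qualified field name to its root name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class RootConfigReader {
    /** Parses a <roots> document into an ordered map of field name to root name. */
    static Map<String, String> readRoots(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> roots = new LinkedHashMap<>();
            // Matches only elements named exactly "root", not "roots" or "root-name".
            NodeList rootElements = doc.getElementsByTagName("root");
            for (int i = 0; i < rootElements.getLength(); i++) {
                Element root = (Element) rootElements.item(i);
                String fieldName = textOf(root, "field-name");
                String rootName = textOf(root, "root-name");
                roots.put(fieldName, rootName);
            }
            return roots;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read root configuration", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag. */
    private static String textOf(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        if (matches.getLength() == 0) {
            throw new IllegalArgumentException("Missing <" + tag + "> element");
        }
        return matches.item(0).getTextContent().trim();
    }
}
```

Applied to the configuration above, the result maps ExampleClass1.exampleField1 to exampleRoot1 and MyClass2.exampleField2 to exampleRoot2.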

Alternatively, roots can be given a common “name” even though they may have two differing fully qualified field names. In this case, the two or more root fields that share the common name will refer to the same object instance. Hence, one can, in two different classes, bind the same root to different variables. In terms of the example, even though the two fields are declared in different classes, if they share a common root name they will refer to the same set of objects.
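The effect of a shared root name can be sketched as a lookup keyed by that name. This is a hypothetical illustration: the class RootRegistry and its bind method are not part of the disclosure, and a real implementation would resolve the name cluster-wide rather than within one process.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical registry: every field bound to the same root name gets the same instance. */
class RootRegistry {
    private final Map<String, Object> rootsByName = new HashMap<>();

    /**
     * Returns the object already bound to rootName, or binds and returns the
     * supplied initial value if this is the first field to use that name.
     */
    synchronized Object bind(String rootName, Supplier<Object> initialValue) {
        return rootsByName.computeIfAbsent(rootName, name -> initialValue.get());
    }
}
```

Two fields in different classes that both bind to, say, "sharedCart" therefore observe one and the same object, while a field bound to a different name receives its own instance.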

The object manager in the client can dynamically prune in-memoryversions of a managed object graph so that only portions of the managedgraph need be stored in the client virtual machine's memory at a time.This allows arbitrarily large shared object graphs to be fit into aconstrained memory footprint in the client virtual machines. Prunedsegments of the object graph can be faulted in from the serverdynamically as needed as code on a virtual machine traverses the managedgraph and follows a reference to an object that has been pruned. Thisprocess happens automatically and transparently to the user code. Asthis happens, the permanent representation of the managed object graphis unaltered in the central manager.

Returning to FIG. 3A, at block 320, the instrumented application begins running and, at block 325, the instrumentation detects operations, such as method calls and field set operations, at a given virtual machine on which the instrumented application is running, that affect the states of the managed objects. The process of detection at step 325 is further detailed with respect to FIGS. 3B and 3C.

At block 330, information identifying the operations, such as the method calls and field set operations, and the central manager level (or global) unique identifier of the object or objects involved, is communicated from the virtual machine to the central manager and, at block 335, the central manager uses the information to update a representation of the managed objects' states locally and at other virtual machines.

The central manager may assign global identifiers to the managed objects so that it can recognize any managed object in the cluster. Conventionally, only locally specific, non-deterministic identifiers are assigned to objects by the virtual machines. In accordance with the technology herein, when a new managed object is created on a local virtual machine, a global unique identifier is assigned to the object by the virtual machine on which the object is created. A group of central manager level unique identifiers is provided by the central manager to each virtual machine.
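The batch assignment of identifiers can be sketched in Java as follows. This is an illustration under stated assumptions, not the described implementation: the class ObjectIdAllocator is hypothetical, and the request to the central manager is modeled as a function that takes a batch size and returns the first identifier of a fresh, non-overlapping range.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical client-side allocator that assigns global object IDs locally
// from a batch reserved at the central manager.
final class ObjectIdAllocator {
    // Assumed callback: given a batch size, returns the first ID of a new range.
    private final LongUnaryOperator requestBatch;
    private final int batchSize;
    private long next;   // next ID to hand out
    private long limit;  // exclusive end of the current batch

    ObjectIdAllocator(LongUnaryOperator requestBatch, int batchSize) {
        this.requestBatch = requestBatch;
        this.batchSize = batchSize;
    }

    // Hands out the next ID; contacts the central manager only when the
    // current batch is exhausted (including on the very first call).
    synchronized long nextId() {
        if (next == limit) {
            next = requestBatch.applyAsLong(batchSize);
            limit = next + batchSize;
        }
        return next++;
    }
}
```

Because each virtual machine draws from its own reserved range, identifiers remain unique across the cluster without a round trip to the central manager for every new object.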

Updates to the fields of a managed object are tracked at a fine grained level of granularity and pushed to other virtual machines via the central manager. By joining a root graph, an object is flagged as managed and its state is kept up-to-date across a cluster of servers.

FIG. 3B illustrates a method for sharing the transaction data involving object data among virtual machines. At block 345, optionally, a determination is first made as to whether a given method is synchronized or a named lock is identified for the method, and at step 350, whether the lock has been acquired. Acquiring a lock is optional depending on how an operator chooses to configure it. Transactions can be created under concurrent locks in which case no locks are acquired. This may be used in the case where potential write-write conflicts are tolerable. At step 355, the application begins operation on the locked code. At block 360, a transaction log starts recording operations which are performed by the thread which affect the states of managed objects at a first boundary in the code. At block 365, the transaction records all operations until block 370, at which point the transaction is concluded when the thread crosses a second transaction boundary. At block 375 the transaction is stored until forwarded to the central manager. At step 380 the lock (if any) is released.

In one case, transactions can be provided on both method and Java synchronization boundaries, where a transaction is a set of changes to managed objects made between defined transaction boundaries. Transactions are associated with, and protected by, zero or more locks. Transactions and locks function as a multi-virtual machine extension of standard Java synchronization and the Java memory model. Java synchronization provides exclusive thread access to sections of code on monitor/lock enter, and flushes local changes to main memory on monitor exit. In a cluster, locks provide a user-defined, cluster-wide access policy to sections of code, and local changes are flushed to the central manager at the close of a transaction. In this way, threads in a cluster of multiple virtual machines can interact in the same way that they do in a single virtual machine.

This is illustrated by FIG. 3C where the transaction boundaries need not be the same as the lock boundaries. For a synchronized block of code that is synchronized on managed object A, a first lock is required and a first transaction boundary (startTransaction(P)) begins after acquisition of the first lock. Where a nested synchronized block of code that is synchronized on managed object B (synchronized(B)) exists, the transaction boundary for the first transaction P is completed and a second transaction started for the nested synchronized block of code. The transaction boundaries in this context are synthesized by the instrumentation of the byte code (or within a suitably enabled virtual machine) to provide transaction boundaries which are granular to the particular functions enumerated in the application code. Each transaction is thus defined (in the Java context) in terms of a thread monitor enter and monitor exit in a code block. For named locks, the transaction is defined in terms of a method boundary.

FIG. 5 illustrates an example of managed objects, including classes and fields. The managed objects include a root object 510 “users” and a number of objects which are reachable from the root object, including an object 520 named “myCache”, an object 530 named “User”, and an object 540 named “Address”. The object 530 has the fields “Name”, “Age” and “Address”. The object 540 named “Address” is reachable from the “Address” field of the object 530, and includes fields “Street”, “State” and “Zipcode”. The objects provided could be used by a web-based application, for instance, which requires a user to provide his or her name, age and address. Note that there is nothing special about the root object 510 or object 520; any object can be identified as a managed object (except objects that represent JVM-specific or host machine specific resources, such as network sockets or file descriptors).

In this example, a users object references a map called myCache. Once one establishes a reference that the cache is managed, then the entire sub-graph of an object is managed. That is, if myCache is managed, as a root, everything it points to is also managed. Note that Java primitives may also be assigned object IDs. Once a managed object has a reference to an unmanaged object, it makes everything that it references become managed.

FIG. 6 illustrates a method for sharing object information among virtual machines using operation logs. In a unique aspect of the technology, object data can be shared logically and physically, depending on the operation on the object by an application. By sharing data using operations on any individual local object, each virtual machine maintains a locally specific representation of object state. To do this, the steps which were taken by a virtual machine to get its memory to store object data are detected and logged, and those steps are then performed at another virtual machine. For example, consider that each virtual machine typically assigns a locally generated identifier for each instantiated object. When information associated with the object, such as field level data is stored, the object identifier, as a key, is hashed to determine a location (bucket) in a hash map in which the information will be stored. However, since each virtual machine uses a different local identifier for its local instance of the same object, each virtual machine's hash map will differ even though each hash map represents the same object state. Thus, physically copying the hash map data in one virtual machine's memory, bit by bit, to the memory of another virtual machine, would not successfully copy the underlying object state information. A specific technique for achieving logical sharing overcomes this problem, as follows.

As noted above in FIG. 3A, when application operations occur at step 320 accessing or affecting a managed object, those operations are detected at step 325 and communicated to the central manager at step 330. The virtual machine (in this example VM1) is responsible for updating and maintaining its own local representation of object state at step 610. VM1 maintains a local representation of the states of the objects which are instantiated by the application, including managed and non-managed objects. This is a base function of the virtual machine.

Step 325 is performed by recording, for example, the method calls or field set operations that the application code has performed. Instead of keeping track of the actual object references, the transaction log keeps track of the actions the application has done. Every time a central manager needs to create the object in a new VM or to make changes to it, it can replay this log. For each action, such as when a new object is created or a function (such as a put call) is performed, this logical action is recorded and the physical steps written into a transaction. Any new objects and their data are now recorded in the log.

These transactions are stored in one format in the memory in the virtual machine, then transmitted to the central manager (in, for example, a serialized format) in the message in the communications layer and deserialized at the central manager.

At step 330, VM1 updates the central manager. The updating may occur from time to time at various points in the execution of the application code. For example, the updating may occur after the application updates one or more managed objects under a lock, as discussed above. As noted briefly above, the instrumentation added to the application code may include a transactionManager.commitTransaction( ) call which takes the log built up in the transaction through this whole process, and communicates it to the central manager. The shipping may occur immediately or in a grouped set of transactions, such as in a batch job.

To perform the update, VM1 communicates a log, VMLog, to the central manager. VM1 may delete the log and start a new log when the central manager confirms receipt of the log. Any type of network communication technique may be used. As mentioned, the data sharing agent/library at each virtual machine may include functionality for communicating with the central manager.

At block 630, the central manager processes the transactions stored in the VMLog to update a local representation of the states of the managed objects. Essentially, the operations such as method calls, with associated field values, which were performed at VM1, are stored in a data structure on the central manager.

A description of each object is provided on the central manager. This description includes meta data defining the object's class, its fields and the field values. For a physical object, for example a class “myclass” with four fields, the server description includes the server class and IDs for each field. For example, a physical object includes the name of the class, the name of the class loader, field name and field value pairs for literal fields, field name and referenced object ID pairs for reference fields, object version, and possibly other information. For logically managed objects, one needs to know what to do with changes which may have occurred. A description of a logically managed object includes, for example, where the logically managed object is a map, the contents of the map, which may be a set of keys, a collection of values, or a set of key-value mappings. The order of a map is defined as the order in which the iterators on the map's collection views return their elements. For this example of a logically managed object, a representation of the map is kept on the central manager. The representation relates the object ID or literal keys to the corresponding object ID or literal values. In addition, a logical action (such as a put) is assigned a function ID which is interpreted by the central manager, allowing the central manager to create the appropriate mapping of keys to values. In the case of other logically managed classes, such as a list, examples of logical actions are add and remove; for a map, actions include puts and gets; any number of logically managed actions may be stored in this manner. The central manager's representation is not another instance of each managed object, but merely a representation of that object.

Each logical action performed on a logically managed object is identified and the data associated with the logical action provided to the central manager to update its representation. These logical actions are passed to any other virtual machine in the cluster that currently has said logically managed object in its memory so they may be replayed against its local instance of said managed object.
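Recording and replaying logical actions can be sketched in Java as follows. The classes LogicalAction and LogicalActionLog are hypothetical and cover only put and remove on a map; the described system identifies actions by function ID and relates keys and values through object IDs, which this sketch omits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical record of one logical action on a logically managed map.
final class LogicalAction {
    enum Kind { PUT, REMOVE }

    private final Kind kind;
    private final Object key;
    private final Object value; // unused for REMOVE

    LogicalAction(Kind kind, Object key, Object value) {
        this.kind = kind;
        this.key = key;
        this.value = value;
    }

    // Performs the same logical step against another instance of the map.
    void applyTo(Map<Object, Object> target) {
        if (kind == Kind.PUT) {
            target.put(key, value);
        } else {
            target.remove(key);
        }
    }
}

// Hypothetical log: records actions in order and replays them elsewhere.
final class LogicalActionLog {
    private final List<LogicalAction> actions = new ArrayList<>();

    void recordPut(Object key, Object value) {
        actions.add(new LogicalAction(LogicalAction.Kind.PUT, key, value));
    }

    void recordRemove(Object key) {
        actions.add(new LogicalAction(LogicalAction.Kind.REMOVE, key, null));
    }

    // Replays every recorded action, in order, against the target map.
    void replayOn(Map<Object, Object> target) {
        for (LogicalAction action : actions) {
            action.applyTo(target);
        }
    }
}
```

Replaying the actions lets the receiving virtual machine place each entry according to its own hashing, which is why the internal layout of the map never needs to be copied.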

At block 635, the central manager updates the other virtual machines in the cluster so that the state of the managed objects at VM1 is replicated at the other virtual machines. As noted above, depending on whether the update is of a physically managed object or a logically managed object, the transaction may have a slightly different format. In addition, there are two different scenarios for an update depending on whether or not the update to the other virtual machines is an initial update (decision block 640).

An initial update occurs when the central manager first updates a virtual machine, in which case it is necessary to convey the current state of the managed objects to the virtual machine. This may occur after application startup or after a new virtual machine joins the cluster. In one approach, the central manager can store each of the logs received from VM1 (or any of a number of VMs) and provide them to the other virtual machines to be played. However, this approach is inefficient as many operations may change the same managed objects repeatedly. Since only the most current state of a managed object is relevant, and not the previous states it traversed to reach the current state, it is more efficient for the central manager to generate a log of operations (central manager log) from its representation of object state (block 645). This approach is more efficient since only the operations which are necessary to reach the current object state are generated. At block 650, the central manager communicates the central manager log to the other virtual machines and, at block 655, the virtual machines play the central manager log to update their local representations of object state. The operations in the central manager log, such as method calls and field set operations with associated values, are performed at the other virtual machines so that the state of the managed objects at the central manager, and at VM1, is replicated at the other virtual machines. An object graph at the other virtual machines is thereby updated so that it is a replica of the object graph at the central manager and at VM1.
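The efficiency argument for the initial update can be illustrated with a small Java sketch. The class ManagerStateSketch and the textual form of the generated operations are assumptions made for illustration; only field set operations are modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the central manager's representation of state.
final class ManagerStateSketch {
    // object ID -> (field name -> current value)
    private final Map<Long, Map<String, Object>> fieldsByObject = new LinkedHashMap<>();

    // Applies one field set operation received in a virtual machine log.
    void apply(long objectId, String field, Object value) {
        fieldsByObject
            .computeIfAbsent(objectId, id -> new LinkedHashMap<>())
            .put(field, value);
    }

    // Generates a log from current state: one operation per field,
    // however many times that field was set in the past.
    List<String> generateLog() {
        List<String> log = new ArrayList<>();
        for (Map.Entry<Long, Map<String, Object>> object : fieldsByObject.entrySet()) {
            for (Map.Entry<String, Object> field : object.getValue().entrySet()) {
                log.add("set " + object.getKey() + "." + field.getKey()
                        + " = " + field.getValue());
            }
        }
        return log;
    }
}
```

If a field is set many times across several virtual machine logs, the generated log still contains a single operation for it, so a newly joined virtual machine replays only what is needed to reach the current state.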

If an initial update of a virtual machine has already been performed, then the subsequent updates can be incremental updates. In this case, the central manager conveys the virtual machine log from VM1 to the other virtual machines (block 660), and the other virtual machines play the virtual machine log to update their local representations of object state (block 665). Again, the object graphs at the other virtual machines are thereby updated so that they are a replica of the object graph at the central manager and at VM1. The updating of the other virtual machines by the central manager may occur from time to time. For example, the central manager may update the other virtual machines when it receives an update from VM1.

Note that the process shown in FIG. 6 is performed at each of the virtual machines in a given cluster, independent of the processes on other servers. Thus, the central manager receives logs from the different virtual machines and communicates the virtual machine logs, or logs generated by the central manager, to the appropriate virtual machines to maintain a consistent representation of the states of the managed objects across all of the virtual machines. Furthermore, by maintaining current state information locally, the central manager can update new virtual machines which are added to a cluster, and virtual machines which come back online after being taken offline, such as for maintenance.

Initial and incremental updates are illustrated further, as follows, in FIG. 7 and FIG. 8, respectively.

FIG. 7 illustrates the sharing of object data from a first virtual machine, in an initial update, using operation logs of the first virtual machine, and an operation log of a central manager. Here, a virtual machine “A” 710 sends a number of virtual machine logs to the central manager 740 over time, as indicated by paths 712. When an initial update of one or more of the other virtual machines is needed, the central manager generates its own log of operations, central manager log, and sends it to the other virtual machines, such as virtual machine “B” 720 and virtual machine “C” 730 via paths 722 and 732, respectively. Thus, one central manager log can represent the changes to object state which result from multiple virtual machine logs.

FIG. 8 illustrates the sharing of object data from a first virtual machine, in an incremental update, using an operation log of the first virtual machine. Here, a virtual machine log sent from virtual machine “A” 810 to the central manager 840 via path 812 is relayed to the other virtual machines, namely virtual machine 820 and virtual machine 830, via paths 822 and 832, respectively. That is, the central manager provides a communication to virtual machine 820 and virtual machine 830 which includes the information from the virtual machine log provided by virtual machine 810. In an alternative, peer-to-peer embodiment of the technology, the virtual machine log from virtual machine 810 could be sent directly by virtual machine 810 to the other virtual machines 820 and 830 rather than being relayed by the central manager.

FIG. 9 illustrates a method for sharing of field level object data and logical operations among virtual machines. As noted above, field level sharing of object data as well as sharing logical operations are unique aspects of the technology.

By sharing object data at a field level of granularity, it is possible to share changes to object state at a fine grained level. That is, changes to specific fields of managed objects can be shared among virtual machines without sending unnecessary information regarding fields of managed objects which have not changed, or fields of unmanaged objects. This approach minimizes the amount of information which needs to be communicated between the central manager and the virtual machines. For example, referring to the “Address” object 540 in FIG. 5, assume the “Street” field is updated to a value of “123 Main Street”. In this case, it would only be necessary to provide the updated value of the “Street” field in order for the central manager and the other virtual machines to update their representations of object state. There is no need to share the other fields of “Address”, such as “State” and “Zipcode”, which did not change. Nor is there a need to share the states of objects from which object 540 can be reached, such as objects 510, 520 and 530, which also did not change. An example process for sharing of field level data among virtual machines follows.

When an operation on a managed object occurs in the application (as in step 320 previously described), transactions are created at step 902 in accordance with the foregoing description of steps 1315, 1320 and 1325. The information transmitted will depend on whether the object is a physically managed object or a logically managed object (step 904). If the object is a physically managed object, at block 915, field level changes to the managed objects are provided in the transaction. That is, the changes are detected at a field level of granularity. This may include, e.g., detecting field level data affected by an application function. At block 920, a central manager uses the field level data to update its local representation of object state. The information provided at step 915 may include, for a physical object, the name of the class, the name of the class loader, field name and field value pairs for literal fields, field name and referenced object ID pairs for reference fields, object version, and possibly other information, as discussed above.

At block 925, to perform an update of any other VM, the central manager communicates the field level data to the other virtual machines in the cluster and, at block 930, the other virtual machines use field level data to update respective local instances of the managed objects.

Similarly, if the transactions affect logically managed objects, the transactions include logical operations at step 935. At block 940, a central manager uses the method calls and other logical operations to update its local representation of object state. At block 945, to perform an update of any other VM, the central manager communicates the logical operations to the other virtual machines in the cluster and, at block 950, the other virtual machines replay those logical operations against their respective instances of the managed objects to update the state of those managed objects.

FIG. 10 illustrates a method for sharing object data among virtual machines while maintaining object identity. In a unique aspect of the technology, where conventionally objects would be distributed by maintaining additional copies of objects in, for example, a clustered Map, the sharing technology maintains the unique identity of managed objects by eliminating the need to copy managed objects themselves.

As noted above, when application operations occur at step 320 accessing or affecting a managed object, those operations are detected at step 325. Each virtual machine (in this example VM1) is responsible for updating and maintaining its own local representation of object state at step 610. VM1 maintains a local representation of the states of the objects which are instantiated by the application, including managed and non-managed objects. The VM updates any change to a local instance of a managed object at step 1015.

Step 325 is performed by recording, for example, the method calls or field set operations that the application code has performed. Instead of keeping track of the actual object references, the transaction log keeps track of the actions the application has done. For each action, at block 1025, data identifying the operations of a changed object is included with the transaction. That is, for each transaction, an object ID is generated at the client and, as noted above, is provided as part of the field data for a physically managed object, as well as the operations data for a logically managed object. Object references are thus maintained in the local representation at the central manager and at any other VM using the CM log to update its local representation of the object. At block 1030, the central manager is updated. To perform the update, VM1 communicates data identifying the object and the logical operations to the central manager. At block 1035, the central manager updates its local representation of managed objects using the object ID and transaction ID data. At block 1040, the central manager communicates data identifying the operations to the other virtual machines and, at block 1045, the other virtual machines update existing instances of the changed objects without creating new instances of the objects. With this approach, object identity is maintained across the virtual machines.

The central manager also provides various cluster wide locking functionality. In one embodiment, both named manual locks (named locks) and automatic locks (auto locks) are provided. Clustered locks can span an arbitrary number of VMs. For automatic locking, the CM globally locks any point where the application code uses, for example, the Java “synchronized” keyword, to provide distributed locking for applications where “synchronized” is already in use or where the application is multi-thread safe. Named locks can be used with help from developers for applications that were never designed to be distributed and multi-threaded. Named locks specify which blocks of code in any application should be locked globally.

Both auto locks and named locks are available in two different modes, clustered locks and greedy locks. With clustered locks, the virtual machine obtains a lock explicitly from the central manager each time a lock is needed.

FIGS. 11A-11D illustrate clustered locking.

At block 320 the application, in the course of performing operations, will request a lock on a managed object. A virtual machine may request a lock when it encounters a block of code which uses the Java keyword “synchronized”, as mentioned previously. Alternatively, the operator may use the management console to designate a block of code of an application which does not use the keyword “synchronized” as a method which invokes a request for the lock.

In one embodiment, both named manual locks and automatic locks are implemented. Administrators can use automatic locks, which globally lock any point where the application code uses, for example, the Java “synchronized” keyword, to provide distributed locking for applications where “synchronized” is already in use or where the application is multi-thread safe. Named locks can be used with help from developers for applications that were never designed to be distributed and multi-threaded. With named locks, users can specify which blocks of code in an application should be locked globally.

At block 1115, VM1 sends a request for a lock to the central manager. At block 1120, the central manager accesses its records to determine if the lock on the object is currently available. For example, the central manager may maintain a record listing object identifiers and associated lock status, indicating whether there is a lock on an object and, if there is a lock, which thread in which virtual machine has the lock, and the type of lock, e.g., read, write or concurrent. A read lock allows all instances of the application on the different virtual machines to have concurrent read access but not write access to managed objects within the scope of the given lock. A write lock allows one thread on one virtual machine to have read and write access to managed objects within the scope of the given lock, but prevents any other thread in any other virtual machine from acquiring the given lock. A concurrent lock allows multiple threads on multiple virtual machines to make changes to managed objects at the same time. This lock maintains a stable view within the transaction that the lock protects but allows write-write conflicts between threads in the same or other virtual machines. Concurrent locks should be used when performance is more important than the possibility of write-write conflicts. In the case of a write-write conflict, the last writer wins.
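The lock record kept by the central manager can be pictured with the following Java sketch. The class ClusterLockTable is hypothetical, and it makes two simplifying assumptions that the description does not state: a lock may be shared only among holders of the same non-write mode, and a request that cannot be granted returns false instead of waiting.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical record of lock status, keyed by a lock identifier.
final class ClusterLockTable {
    enum Mode { READ, WRITE, CONCURRENT }

    private final Map<String, Mode> heldMode = new HashMap<>();
    private final Map<String, Integer> holderCount = new HashMap<>();

    // Grants the lock if it is free, or if it is already held in the same
    // shareable mode (READ or CONCURRENT). WRITE is always exclusive.
    synchronized boolean tryAcquire(String lockId, Mode mode) {
        Mode current = heldMode.get(lockId);
        if (current == null) {
            heldMode.put(lockId, mode);
            holderCount.put(lockId, 1);
            return true;
        }
        if (current == mode && mode != Mode.WRITE) {
            holderCount.merge(lockId, 1, Integer::sum);
            return true;
        }
        return false;
    }

    // Releases one hold; the lock becomes free when no holders remain.
    synchronized void release(String lockId) {
        Integer count = holderCount.get(lockId);
        if (count == null) {
            return; // nothing held under this identifier
        }
        if (count == 1) {
            holderCount.remove(lockId);
            heldMode.remove(lockId);
        } else {
            holderCount.put(lockId, count - 1);
        }
    }
}
```

A fuller version would also record which thread in which virtual machine holds each lock and would queue blocked requests, as the description indicates.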

At decision block 1125, if the lock is not available, the central manager waits until the lock becomes available. When the lock is available, the central manager grants the lock to the requesting virtual machine, at block 1135, and updates its records accordingly. If applicable, at block 1140, the central manager blocks any other thread in the same or other virtual machines from taking control of the lock. At block 1145, the virtual machine may perform any operation under the lock locally without CM interaction.

After using the lock, the virtual machine informs the central manager that it is releasing it, at block 1150. At block 1155, the central manager updates its records accordingly, and grants the lock to the next thread in contention for that lock in any connected virtual machine, if any. That is, the lock is granted to any other thread in any virtual machine that is blocked in contention for the lock.

In another alternative, the lock is a “greedy” lock, in which case the virtual machine holds the lock so that threads local to that virtual machine may acquire and release the lock repeatedly (at step 1145) without communicating with the central manager until the central manager commands it to release the lock. With a greedy lock, VM1 holds the lock not only for the duration of one synchronized block, but until the lock is recalled by the central manager, such as if another virtual machine requests the lock from the central manager.

FIGS. 12A-12D illustrate the operation of a greedy lock. Recall that, with a clustered lock, VM1 releases the lock after a specified duration of work, namely one and only one block of code protected by that lock. With a greedy lock, VM1 may continue to process as many blocks of code protected by that lock as it needs, in a local lock context, until the lock is recalled by the central manager, such as if another virtual machine requests the lock from the central manager.

FIG. 12A is equivalent to FIG. 11A up to step 1135. FIG. 12A may be read in conjunction with illustrations in FIGS. 12B-12D. Once a greedy lock is granted to VM1, at step 1240, VM1 holds the lock and may access and release the lock locally without CM interaction. At step 1245, the CM receives a request for the lock from another VM. At step 1250, the central manager will request that VM1 release the lock and when VM1 releases the lock at step 1255, the CM updates its records accordingly at step 1260 and grants the lock to the requesting VM.

FIG. 12E shows a greedy lock state machine. This diagram uses two VMs as a simplification. The state may be initialized at the Lock Requested state in VM1. The sole exit transition is to the lock state maintained in the central manager. Two transitions can exit this state: No Others In Contention, meaning no other virtual machines are in contention for the lock, or Others In Contention, meaning other virtual machines are in contention for the lock. If the No Others In Contention condition is true, the VM will transition to the Lock Entered state, then to the Lock Complete state, and back to the Lock Requested state. From here the transitions and states remain the same for VM1, in a loop, until such point where another VM requests the lock and the Others In Contention transition is followed out of the Lock Requested state to the Blocked state on the central manager. At this point the central manager blocks VM1 from moving to another Lock Entered state and instead hands the greedy lock to VM2, which then can enter its own series of Lock Requested, check Lock, Lock Entered, Lock Complete state transitions until another VM requests the greedy lock.
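The difference between a greedy lock and a clustered lock lies in how often the virtual machine contacts the central manager, which the following Java sketch tries to make concrete. The class GreedyLockHolder is hypothetical, the two Runnable callbacks stand in for messages to the central manager, and the sketch tracks state for one local user of the lock at a time; it does not block or queue local threads.

```java
// Hypothetical virtual-machine-side holder of a single greedy lock.
final class GreedyLockHolder {
    private final Runnable acquireFromManager; // assumed message to the CM
    private final Runnable releaseToManager;   // assumed message to the CM
    private boolean heldByVm;         // the VM currently owns the cluster lock
    private boolean inUse;            // a local thread is inside protected code
    private boolean recallRequested;  // the CM asked for the lock back

    GreedyLockHolder(Runnable acquireFromManager, Runnable releaseToManager) {
        this.acquireFromManager = acquireFromManager;
        this.releaseToManager = releaseToManager;
    }

    // Entering protected code contacts the CM only if the VM lacks the lock.
    synchronized void enter() {
        if (!heldByVm) {
            acquireFromManager.run();
            heldByVm = true;
        }
        inUse = true;
    }

    // Leaving protected code keeps the lock unless a recall is pending.
    synchronized void exit() {
        inUse = false;
        if (recallRequested) {
            giveBack();
        }
    }

    // Called when the CM recalls the lock, e.g. because another VM wants it.
    synchronized void recall() {
        if (inUse) {
            recallRequested = true; // hand back once the current block completes
        } else if (heldByVm) {
            giveBack();
        }
    }

    private void giveBack() {
        releaseToManager.run();
        heldByVm = false;
        recallRequested = false;
    }
}
```

Repeated enter and exit calls in this sketch cost one message to the central manager in total, until a recall arrives.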

The following pseudo-code provides a further illustration of the concepts described herein. Assume the following pseudo-code represents application code which has been instrumented in accordance with the discussion of step 305. In this example, a new thread is adding a new object and the agent will traverse the graph of the person object and make all the objects it refers to managed and give them all object IDs. A record of these new objects is placed into the transaction log which will be forwarded to the central manager. Note that each VM gets a batch of central manager level object IDs ahead of time so that it can assign them to the objects. For example, each VM may get any number of new object IDs that it can assign locally at will. If it runs out of object IDs, it can request additional IDs from the central manager. Also note that the VM's internal object ID does not affect the central manager or the central manager level object ID. The code below may be operated on by a thread on any virtual machine.

    class Cache {
      /* Define the object “cache” */

      /* Define the object “myCache” as an empty HashMap */
      Map myCache = new HashMap();

      /* Call the “put” method */
      public void put(String name, User user) {

        /* Request lock on myCache at virtual machine */
        synchronized (myCache) {

          /* Request lock on myCache from central manager
             (this code is added by instrumentation) */
          lockManager.getLock(myCache);

          /* Start a transaction at the virtual machine
             (this code is added by instrumentation) */
          transactionManager.startTransaction();

          /* Call the put method for “myCache” */
          myCache.put(name, user);

          /* Commits the transaction log to the central manager
             (this code is added by instrumentation) */
          transactionManager.commitTransaction();

          /* Release the lock on myCache at virtual machine
             (this code is added by instrumentation) */
          lockManager.releaseLock(myCache);
        }
      }
    }

First, assume that at VM1 there are two threads active, both asking for a lock on the object “myCache”. In this example, the object myCache has been identified as a managed object. A first thread will be granted the lock by virtual machine VM1. Next, the data sharing agent indicates to the central manager that there is a thread on VM1 that has requested a lock on myCache. The agent requests a lock from the server. If no other virtual machine has the lock, the central manager will grant the lock to the VM.

At this point, the application code at VM1 is able to move on. In the virtual machine, the agent starts a transaction for this thread. The agent will now keep track of logical actions and field changes to managed objects performed by thread one. This occurs whenever a thread obtains a lock on a managed object.
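The bookkeeping the agent performs for a thread holding a lock on a managed object might be sketched as follows. This is a minimal illustration under stated assumptions: the class `TransactionLogSketch`, its `record` method, the string form of a change, and the in-memory list standing in for the central manager are all hypothetical and are not taken from the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-thread transaction log; all names here are illustrative. */
class TransactionLogSketch {

    /** One recorded change: a logical action or a field change on a managed object. */
    static final class Change {
        final long objectId;
        final String description;

        Change(long objectId, String description) {
            this.objectId = objectId;
            this.description = description;
        }

        @Override
        public String toString() {
            return objectId + ":" + description;
        }
    }

    // Each thread has its own open transaction (null when none is open).
    private final ThreadLocal<List<Change>> current = new ThreadLocal<>();

    // Stand-in for the central manager: committed transactions in commit order.
    private final List<List<Change>> committed = new ArrayList<>();

    /** Opens a transaction for the calling thread. */
    void startTransaction() {
        current.set(new ArrayList<>());
    }

    /** Records a change made by the calling thread inside its open transaction. */
    void record(long objectId, String description) {
        List<Change> log = current.get();
        if (log == null) {
            throw new IllegalStateException("no open transaction");
        }
        log.add(new Change(objectId, description));
    }

    /** Ships the calling thread's log to the (simulated) central manager. */
    void commitTransaction() {
        List<Change> log = current.get();
        if (log == null) {
            throw new IllegalStateException("no open transaction");
        }
        synchronized (committed) {
            committed.add(log);
        }
        current.remove();
    }

    /** Returns a snapshot of the committed transactions, oldest first. */
    List<List<Change>> committedTransactions() {
        synchronized (committed) {
            return new ArrayList<>(committed);
        }
    }

    public static void main(String[] args) {
        TransactionLogSketch tm = new TransactionLogSketch();
        tm.startTransaction();
        tm.record(42L, "put(name, user)");
        tm.record(43L, "set field 'email'");
        tm.commitTransaction();
        System.out.println(tm.committedTransactions());
    }
}
```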

Once this first thread has received a lock and started its transaction, it is able to execute the operations in the protected block of code as originally defined by the application code. Suppose, for example, that a second VM has another thread trying to execute the same block of code, protected by the same lock, on a different virtual machine. There are now two threads locally that are synchronized by VM1, and a third thread on VM2 trying to access the same object. The native lock manager of VM2 will allow this lock, but when the function “getLock” is performed on VM2, the central manager will not grant VM2 the lock because it is already held by thread one in VM1.

Threads two and three are blocked trying to get a lock. Thread two is on the same VM as thread one, so it is blocked trying to get the VM1 object monitor from the native lock manager of VM1. Thread three has been given the local monitor on VM2 but is blocked trying to get the clustered lock from the central manager.
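The two levels of locking in this walkthrough, a native monitor per VM followed by a clustered lock from the central manager, can be simulated inside a single process. The sketch below is an assumption-laden illustration: a `java.util.concurrent.Semaphore` stands in for the central manager's clustered lock, two distinct monitor objects stand in for the local copies of myCache on VM1 and VM2, and a shared counter stands in for the protected state. None of these names come from the description.

```java
import java.util.concurrent.Semaphore;

class TwoTierLockSketch {
    // Stand-in for the clustered lock the central manager holds for one managed object.
    static final Semaphore clusteredLock = new Semaphore(1, true);

    // Stand-in for the state guarded by the protected block of code.
    static int sharedCounter = 0;

    /** Mirrors the instrumented block: local monitor first, then the clustered lock. */
    static void protectedIncrement(Object localMonitor) {
        synchronized (localMonitor) {                // tier 1: native monitor of this VM
            clusteredLock.acquireUninterruptibly();  // tier 2: clustered lock
            try {
                sharedCounter++;
            } finally {
                clusteredLock.release();
            }
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
    }

    /** Runs threads one and two on "VM1" and thread three on "VM2"; returns the final count. */
    static int runDemo(int iterations) {
        sharedCounter = 0;
        Object vm1Monitor = new Object(); // local copy of myCache on VM1
        Object vm2Monitor = new Object(); // local copy of myCache on VM2

        Runnable onVm1 = () -> {
            for (int i = 0; i < iterations; i++) {
                protectedIncrement(vm1Monitor);
            }
        };
        Runnable onVm2 = () -> {
            for (int i = 0; i < iterations; i++) {
                protectedIncrement(vm2Monitor);
            }
        };

        Thread one = new Thread(onVm1);
        Thread two = new Thread(onVm1);
        Thread three = new Thread(onVm2);
        one.start();
        two.start();
        three.start();
        joinQuietly(one);
        joinQuietly(two);
        joinQuietly(three);
        return sharedCounter;
    }

    public static void main(String[] args) {
        System.out.println("final count: " + runDemo(1000));
    }
}
```

Threads one and two contend on the same local monitor, while thread three always obtains its own monitor immediately and contends only for the simulated clustered lock, which is the situation described for VM2.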

The application code can then perform the put operation (in this example) on the object once the lock is granted. Once this is completed, the transactionManager.commitTransaction( ) call takes the log built up in the transaction and ships it to the central manager. Next, since thread one is finished with the lock, the lockManager.releaseLock(myCache) call releases the clustered lock. Thread one exits the protected block of code and has now completed its work.

Once thread one in VM1 has released the local lock, the native lock manager in VM1 allows thread two to obtain the local lock. If the central manager grants thread two the clustered lock, thread two executes the same block of code against another user object. While that is happening, thread three at VM2 is still blocked in contention for the clustered lock from the server, even though it has been granted the local lock by the native lock manager in VM2. Thread three remains blocked until thread two completes its execution of the protected block of code and releases the lock. At such time, the central manager awards the clustered lock to thread three in VM2. Because thread three is in a different VM from threads one and two, the transactions created by threads one and two must be applied, in order, at VM2 to bring the changed managed objects up to date before thread three is allowed to execute the protected block of code. Once the transactions created by VM1 under the scope of the clustered lock have been applied in VM2, thread three is allowed to execute the protected block of code. When thread three has completed the protected block of code, the transaction it created is committed and the clustered lock is released by thread three. The clustered lock returns to its uncontended state.
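The requirement that earlier transactions be applied, in order, at VM2 before thread three runs can be sketched as below. The classes `CentralLog` and `Replica`, the `catchUp` method, and the representation of a transaction as a single map put are assumptions chosen to keep the example small; the sketch is single-threaded and leaves out the locking itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrderedApplySketch {

    /** A committed change to the clustered map, simplified to one put(key, value). */
    static final class Txn {
        final String key;
        final String value;

        Txn(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for the central manager's ordered transaction log. */
    static final class CentralLog {
        private final List<Txn> txns = new ArrayList<>();

        synchronized void commit(Txn t) {
            txns.add(t);
        }

        /** Returns the transactions committed at or after the given position. */
        synchronized List<Txn> since(int index) {
            return new ArrayList<>(txns.subList(index, txns.size()));
        }
    }

    /** One VM's local copy of the managed map. */
    static final class Replica {
        final Map<String, String> localCopy = new HashMap<>();
        private final CentralLog log;
        private int applied = 0; // number of log entries already reflected locally

        Replica(CentralLog log) {
            this.log = log;
        }

        /** Run when the clustered lock is granted, before the protected block executes. */
        void catchUp() {
            for (Txn t : log.since(applied)) {
                localCopy.put(t.key, t.value); // apply in commit order
                applied++;
            }
        }

        /** The protected block: bring the copy up to date, change it, commit the change. */
        void put(String key, String value) {
            catchUp();
            localCopy.put(key, value);
            log.commit(new Txn(key, value));
            applied++; // our own transaction is already reflected locally
        }
    }

    /** Two writes on "VM1", then "VM2" catches up; returns what VM2 sees for the key. */
    static String runDemo() {
        CentralLog log = new CentralLog();
        Replica vm1 = new Replica(log);
        Replica vm2 = new Replica(log);
        vm1.put("alice", "v1"); // thread one
        vm1.put("alice", "v2"); // thread two
        vm2.catchUp();          // before thread three runs the protected block
        return vm2.localCopy.get("alice");
    }

    public static void main(String[] args) {
        System.out.println("VM2 sees alice = " + runDemo());
    }
}
```

Applying the two transactions out of order would leave VM2 with the older value, which is why the order of application matters in the walkthrough above.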

FIG. 13 illustrates a method for signaling between threads in separate virtual machines by extending thread signaling mechanisms built into the virtual machine to have a clustered meaning. In a unique aspect of the technology, thread signaling, such as the object.wait( ), object.notify( ) and thread.join( ) methods in the Java Virtual Machine, is extended to apply to all threads in all virtual machines in the cluster. As mentioned previously, synchronization of multiple threads on a single virtual machine is conventionally achieved using locks that allow only one of the threads in that virtual machine to execute a protected block of code at a time, and this conventional locking is extended to have a clustered meaning. In addition, a technique is required for signaling waiting threads which may be distributed across different virtual machines.

For example, if a thread currently holds the lock on an object, it may call “object.wait( )” which causes the calling thread to release that object's lock and pause execution. Another thread may then acquire the lock on that object. It may then call the Java method “object.notify( )” which will notify a single thread waiting on that object. It may also call the Java method “object.notifyAll( )” which will notify all threads waiting on that object. Waiting threads that are notified in this way resume execution and go back into active contention for that object's lock. While this is satisfactory in a single-virtual machine environment, a technique is needed for signaling threads on different virtual machines to coordinate the pausing and resuming of thread execution. A technique is needed for extending existing thread signaling mechanisms such as Java's “object.wait( )” and “object.notify( )” methods to apply to all threads in all virtual machines in the cluster as they do to threads in the same virtual machine. See, for example, http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.8. An example of such a technique follows.
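For reference, the conventional single-virtual-machine behavior described in this paragraph looks like the following in plain Java. The class name `WaitNotifySketch` and its fields are illustrative; only `Object.wait()`, `Object.notifyAll()` and `Thread.join()` are standard library calls.

```java
class WaitNotifySketch {
    private final Object lock = new Object();
    private boolean ready = false;
    private String message;

    /** Waits until a message has been published, releasing the lock while waiting. */
    String awaitMessage() throws InterruptedException {
        synchronized (lock) {
            while (!ready) {   // loop guards against spurious wakeups
                lock.wait();   // releases the monitor and pauses this thread
            }
            return message;
        }
    }

    /** Publishes a message and wakes every thread waiting on the lock. */
    void publish(String m) {
        synchronized (lock) {
            message = m;
            ready = true;
            lock.notifyAll();  // waiters resume and contend for the monitor again
        }
    }

    /** Starts one waiting thread, publishes from the caller, returns what the waiter saw. */
    static String runDemo() {
        WaitNotifySketch s = new WaitNotifySketch();
        String[] received = new String[1];
        Thread waiter = new Thread(() -> {
            try {
                received[0] = s.awaitMessage();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.start();
        s.publish("hello");
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

Both threads here share one monitor inside one virtual machine; the technique that follows is concerned with the case where the waiting and notifying threads do not.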

In a unique aspect of the technology, the native thread signaling utilities of a virtual machine are extended to the cluster. These can include, in a Java context, synchronization (grabbing the lock in the first place), wait and notify, the Thread.join( ) method, and the like. In other virtual machine contexts, other thread signaling technologies may be extended.

In FIG. 13, this feature of the technology is described with respect to the object.wait( ) and object.notify( ) utilities, but the technology is not limited to these signaling utilities. At block 1300, instrumented application byte code running in a thread at a first virtual machine implements a synchronized call on a managed object. After performing one or more operations defined in the application code, an object.wait( ) call is encountered in the code at step 1305. The thread will now release the lock it has on that object, pause execution and await notification at step 1320. Another thread may then acquire that object's lock. This thread may be on the same virtual machine as that of step 1300 or, in accordance with the technology, a different virtual machine. At step 1315, the second thread calls object.notify( ) and, assuming that the thread is executing on a different virtual machine, the notify signal is passed to the central manager at step 1318. At step 1325, the central manager distributes the notify signal to a waiting thread or all waiting threads.

Once the notification is sent, step 1320 is true and the first thread will then request access to a lock at step 1330. Any other threads which were waiting on the notification will likewise resume execution and request access to the lock at step 1335. At step 1340, the central manager will perform lock management in accordance with the foregoing discussions.
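One way the central manager's part in steps 1318 through 1335 might be modeled is as a registry of waiting threads keyed by managed object ID. The sketch below is hypothetical throughout: the class `ClusterNotifySketch`, the method names, the string identifiers for virtual machines and threads, and the first-in-first-out choice of which waiter a single notify wakes are all assumptions. Neither the description nor Java's own notify semantics prescribes which waiting thread is selected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical central-manager registry of threads waiting on managed objects. */
class ClusterNotifySketch {

    /** Identifies one waiting thread by virtual machine and thread name. */
    static final class Waiter {
        final String vm;
        final String thread;

        Waiter(String vm, String thread) {
            this.vm = vm;
            this.thread = thread;
        }

        @Override
        public String toString() {
            return vm + "/" + thread;
        }
    }

    private final Map<Long, Deque<Waiter>> waiters = new HashMap<>();

    /** Recorded when a thread calls wait on a managed object and releases its lock. */
    synchronized void registerWait(long objectId, String vm, String thread) {
        waiters.computeIfAbsent(objectId, k -> new ArrayDeque<>())
               .add(new Waiter(vm, thread));
    }

    /** Handles a notify: selects at most one waiter (oldest first in this sketch). */
    synchronized List<String> notifyOne(long objectId) {
        List<String> woken = new ArrayList<>();
        Deque<Waiter> queue = waiters.get(objectId);
        if (queue != null && !queue.isEmpty()) {
            woken.add(queue.poll().toString());
        }
        return woken;
    }

    /** Handles a notify-all: selects every waiter registered for the object. */
    synchronized List<String> notifyAllWaiters(long objectId) {
        List<String> woken = new ArrayList<>();
        Deque<Waiter> queue = waiters.remove(objectId);
        if (queue != null) {
            for (Waiter w : queue) {
                woken.add(w.toString());
            }
        }
        return woken;
    }

    public static void main(String[] args) {
        ClusterNotifySketch manager = new ClusterNotifySketch();
        manager.registerWait(7L, "VM1", "thread-one");
        manager.registerWait(7L, "VM2", "thread-three");
        System.out.println("notify wakes: " + manager.notifyOne(7L));
        System.out.println("notifyAll wakes: " + manager.notifyAllWaiters(7L));
    }
}
```

The returned identifiers represent the threads to which the signal would be forwarded; each of those threads would then request the clustered lock again before resuming, as in steps 1330 and 1335.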

Note that while example implementations are discussed in which virtual machines run on servers, which is a suitable approach for storing large amounts of data for a web-based application for instance, any type of computing device may be used, including personal computers, minicomputers, mainframes, handheld computing devices, mobile computing devices, and so forth. Typically, these computing devices will include one or more processors in communication with one or more processor readable storage devices, communication interfaces, peripheral devices, and so forth. Examples of storage devices include RAM, ROM, hard disk drives, floppy disk drives, CD-ROMs, DVDs, flash memory, and so forth. Examples of peripherals include printers, monitors, keyboards, pointing devices, and so forth. Examples of communication interfaces include network cards, modems, wireless transmitters/receivers, and so forth. In some embodiments, all or part of the functionality is implemented in software, including firmware and/or micro code, that is stored on one or more processor readable storage devices and is used to program one or more processors to achieve the functionality described herein.

The foregoing detailed description of the technology herein has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the technology to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen in order to best explain the principles of the technology and its practical application to thereby enable others skilled in the art to best utilize the technology in various embodiments and with various modifications as are suited to the particular use contemplated.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

1. A method for distributing thread signaling amongst virtual machines in a cluster, comprising: receiving a signal from a first virtual machine indicating a method call on a method in a first thread which is dependent on an action of a second thread; monitoring at least the second thread for completion of the action; providing a signal to the first thread on the first virtual machine indicating the action of the second thread; and wherein the second thread is operating on a second virtual machine.
 2. The method of claim 1 wherein the action is a method call.
 3. The method of claim 2 wherein the method call is a synchronized method.
 4. The method of claim 2 wherein the method call is a wait method.
 5. The method of claim 2 wherein the method is a notify method.
 6. The method of claim 1 wherein the action is a join.
 7. A computer-implemented method for providing signaling between threads on different virtual machines, comprising: receiving a notification that a first thread on a first virtual machine is signaling an action on a method in the thread; and informing one or more additional virtual machines, on which one or more respective instances of the application are running and having one or more respective threads, of the signal by the first thread.
 8. The method of claim 7 wherein the notification is for completion of the action.
 9. The method of claim 2 wherein the action is a synchronized method.
 10. The method of claim 2 wherein the action is a notify method.
 11. A computer-implemented method for distributing signaling between threads on different virtual machines, comprising: notifying a central manager that a first thread running at a first virtual machine is dependent on a second thread; receiving a notification from the central manager indicating that the second thread has performed an action; responsive to the notification, the first thread continuing operation; wherein the second thread is operating on a second virtual machine.
 12. The computer implemented method of claim 11 wherein the step of notifying includes entering a wait state.
 13. The method of claim 11 wherein the notification is a notify.
 14. The method of claim 11 wherein the notification is a notify all.
 15. The method of claim 11 wherein the notification is a join.
 16. A computer-implemented method for providing signaling between threads on different virtual machines, comprising: receiving a signal from a first virtual machine on which a first thread is running; determining whether the signal should be distributed to other threads in a cluster of virtual machines; and responsive to the determining step, providing the signal to at least a second virtual machine.
 17. The computer-implemented method of claim 16 wherein the signal is a notify.
 18. The computer-implemented method of claim 16 wherein the signal is a notify all.
 19. The computer-implemented method of claim 16 wherein the step of determining includes identifying one or more managed objects affected by the thread. 